Rigid Transformations for Stabilized Lower Dimensional Space to Support Subsurface Uncertainty Quantification and Interpretation
Subsurface datasets inherently possess big data characteristics such as vast
volume, diverse features, and high sampling speeds, further compounded by the
curse of dimensionality from various physical, engineering, and geological
inputs. Among the existing dimensionality reduction (DR) methods, nonlinear
dimensionality reduction (NDR) methods, especially Metric-multidimensional
scaling (MDS), are preferred for subsurface datasets due to their inherent
complexity. While MDS retains intrinsic data structure and quantifies
uncertainty, its limitations include solutions that are stable only up to
Euclidean transformations and the absence of an out-of-sample point (OOSP)
extension. To enhance subsurface inferential and machine learning workflows,
datasets must be transformed into stable, reduced-dimension representations
that accommodate OOSP.
Our solution employs rigid transformations to obtain a stabilized,
Euclidean-invariant representation of the lower dimensional space (LDS). By
computing an MDS input dissimilarity
matrix and applying rigid transformations to multiple realizations, we ensure
transformation invariance and integrate OOSP. This process leverages a convex
hull algorithm and incorporates a loss function and normalized stress for
distortion quantification. We validate our approach with synthetic data,
varying distance metrics, and real-world wells from the Duvernay Formation.
Results confirm our method's efficacy in achieving consistent LDS
representations. Furthermore, our proposed "stress ratio" (SR) metric provides
insight into uncertainty, beneficial for model adjustments and inferential
analysis. Consequently, our workflow promises enhanced repeatability and
comparability in NDR for subsurface energy resource engineering and associated
big data workflows.
Comment: 30 pages, 17 figures, Submitted to Computational Geosciences Journal
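The stabilization idea described above — embed a dissimilarity matrix with metric MDS, then rigidly align realizations so they agree up to rotation, reflection, and translation — can be sketched in a few lines. This is a minimal illustration, not the paper's workflow: the toy data, the plain Euclidean dissimilarities, and the use of a known rotation to emulate a second realization are all assumptions for the example; the convex hull step and OOSP handling are not reproduced.

```python
import numpy as np
from sklearn.manifold import MDS
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                          # toy samples (assumed data)
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # pairwise dissimilarity matrix

# Metric MDS embeds the dissimilarity matrix into a 2-D LDS.
A = MDS(n_components=2, dissimilarity="precomputed",
        random_state=0).fit_transform(D)

# Emulate a second realization that differs only by a Euclidean transformation.
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
B = A @ R_true + np.array([3.0, -1.0])                # rotated and translated copy

# Rigid alignment: center both realizations, then solve the orthogonal
# Procrustes problem to remove the rotation/reflection ambiguity.
A_c, B_c = A - A.mean(0), B - B.mean(0)
R, _ = orthogonal_procrustes(B_c, A_c)
B_aligned = B_c @ R                                   # now matches A_c

# Normalized stress quantifies the distortion of an embedding.
def normalized_stress(D, Y):
    d = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    return np.sqrt(np.sum((D - d) ** 2) / np.sum(D ** 2))
```

Because the alignment is restricted to orthogonal transformations (no scaling), the aligned realizations share one coordinate frame, which is what makes repeated NDR runs comparable.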
Mitigation of Spatial Nonstationarity with Vision Transformers
Spatial nonstationarity, the location variance of features' statistical
distributions, is ubiquitous in many natural settings. For example, in
geological reservoirs rock matrix porosity varies vertically due to
geomechanical compaction trends, in mineral deposits grades vary due to
sedimentation and concentration processes, in hydrology rainfall varies due to
the atmosphere and topography interactions, and in metallurgy crystalline
structures vary due to differential cooling. Conventional geostatistical
modeling workflows rely on the assumption of stationarity to model spatial
features for geostatistical inference. Nevertheless, this assumption is often
unrealistic for nonstationary spatial data, which has motivated a variety of
nonstationary spatial modeling workflows such
as trend and residual decomposition, cosimulation with secondary features, and
spatial segmentation and independent modeling over stationary subdomains. The
advent of deep learning technologies has enabled new workflows for modeling
spatial relationships. However, there is a paucity of demonstrated best
practice and general guidance on mitigation of spatial nonstationarity with
deep learning in the geospatial context. We demonstrate the impact of two
common types of geostatistical spatial nonstationarity on deep learning model
prediction performance and propose the mitigation of such impacts using
self-attention (vision transformer) models. We demonstrate the utility of
vision transformers for the mitigation of nonstationarity with relative errors
as low as 10%, exceeding the performance of alternative deep learning methods
such as convolutional neural networks. We establish best practice by
demonstrating the ability of self-attention networks for modeling large-scale
spatial relationships in the presence of commonly observed geospatial
nonstationarity.
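The reason self-attention can capture trend-scale relationships is that every patch attends to every other patch in a single layer, rather than relying on stacked local receptive fields as convolutions do. A minimal numpy sketch of the patch-embedding and scaled dot-product self-attention steps of a vision transformer is below — this is not the paper's architecture, and the 16x16 "porosity map", patch size, and random weights are illustrative assumptions only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention: each patch forms a query, compares it
    against every patch's key, and mixes all values by the resulting weights —
    so long-range (e.g., compaction-trend) relationships are modeled directly."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
img = rng.normal(size=(16, 16))          # toy nonstationary property map (assumed)

# Split into 4x4 patches and flatten: the tokenization step of a ViT.
patches = img.reshape(4, 4, 4, 4).transpose(0, 2, 1, 3).reshape(16, 16)

d = 8                                    # embedding dimension (assumed)
W_embed = rng.normal(size=(16, d))
X = patches @ W_embed                    # (16 patches, d) linear patch embedding

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)      # (16, d); each row mixes all patches
```

Note that the attention weight matrix is dense over all patch pairs, so a vertically varying trend influences every output token in one step.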
Optimal Placement of Public Electric Vehicle Charging Stations Using Deep Reinforcement Learning
The placement of charging stations in areas with developing charging
infrastructure is a critical component of the future success of electric
vehicles (EVs). In Albany County in New York, the expected rise in the EV
population requires additional charging stations to maintain a sufficient level
of efficiency across the charging infrastructure. A novel application of
Reinforcement Learning (RL) is able to find optimal locations for new charging
stations given the predicted charging demand and current charging locations.
The most important factors that influence charging demand prediction include
the conterminous traffic density, EV registrations, and proximity to certain
types of public buildings. The proposed RL framework can be refined and applied
to cities across the world to optimize charging station placement.
Comment: 25 pages with 12 figures. Shankar Padmanabhan and Aidan Petratos
provided equal contributions.
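Framing siting as sequential decisions is what makes RL applicable here: the state is the set of stations placed so far, an action adds a station, and the reward is the demand newly served. A tabular Q-learning sketch on a toy five-site corridor is below — the demand values, coverage rule, and hyperparameters are illustrative assumptions, not the paper's demand model or the Albany County data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy problem: 5 candidate sites on a corridor with predicted charging
# demand; a station serves its own site and its immediate neighbors.
demand = np.array([1.0, 5.0, 2.0, 4.0, 1.0])
n_sites, n_place = len(demand), 2

def coverage(placed):
    covered = set()
    for s in placed:
        covered |= {s - 1, s, s + 1} & set(range(n_sites))
    return float(demand[sorted(covered)].sum()) if covered else 0.0

# Tabular Q-learning: state = frozenset of placed stations, action = next site,
# reward = marginal demand newly covered by that placement.
Q = {}
q = lambda s: Q.setdefault(s, np.zeros(n_sites))

def pick(state, eps):
    avail = [a for a in range(n_sites) if a not in state]
    if rng.random() < eps:
        return int(rng.choice(avail))
    qs = q(state).copy()
    qs[list(state)] = -np.inf            # never rebuild on an occupied site
    return int(np.argmax(qs))

alpha, gamma = 0.2, 1.0
for _ in range(3000):
    state = frozenset()
    for _ in range(n_place):
        a = pick(state, eps=0.2)
        nxt = state | {a}
        r = coverage(nxt) - coverage(state)
        target = r if len(nxt) == n_place else r + gamma * q(nxt).max()
        q(state)[a] += alpha * (target - q(state)[a])
        state = nxt

# Greedy rollout of the learned policy gives the proposed station locations.
state = frozenset()
for _ in range(n_place):
    state |= {pick(state, eps=0.0)}
```

Because rewards and transitions are deterministic in this toy, the Q-table converges quickly; a realistic version would replace `coverage` with a learned demand-prediction model over traffic density, EV registrations, and building proximity.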
Texas orphan well road traversal
Contains data used in the FRI Summer Fellowship 2021 Plug and Abandon project: shapefiles storing data on the roads of Texas as well as orphaned wells within each county.
Fair train-test split in machine learning: Mitigating spatial autocorrelation for improved prediction accuracy
Highlights
• Our data split method handles spatial autocorrelation and imposes prediction fairness.
• The sets impose fair algorithms with similar difficulty in all machine learning steps.
• Kriging variance is a surrogate of spatial prediction difficulty.
• The resulting training and test sets are compatible with any machine learning model.
Machine learning supports prediction and inference in multivariate and complex datasets where observations are spatially related to one another. Frequently, these datasets exhibit spatial autocorrelation that violates the assumption of independently and identically distributed data. Overlooking this correlation results in over-optimistic models that fail to account for the geographical configuration of the data. Furthermore, although different data split methods account for spatial autocorrelation, these methods are inflexible, and the parameter training and hyperparameter tuning of the machine learning model are set with a different prediction difficulty than the planned real-world use of the model. In other words, it is an unfair training-testing process. We present a novel method that considers spatial autocorrelation and the planned real-world use of the spatial prediction model to design a fair train-test split.
Demonstrations include two examples of the planned real-world use of the model using a realistic multivariate synthetic dataset and the analysis of 148 wells from an undisclosed Equinor play. First, the workflow applies the semivariogram model of the target to compute the simple kriging variance as a proxy of spatial estimation difficulty based on the spatial data configuration. Second, the workflow employs modified rejection sampling to generate a test set with prediction difficulty similar to the planned real-world use of the model. Third, we compare 100 realizations of the test set to the model's planned real-world use, using probability distributions and two divergence metrics: the Jensen-Shannon distance and the mean squared error. The analysis ranks the spatial fair train-test split method as the only one to replicate the difficulty (i.e., kriging variance), compared to the validation set approach and spatial cross-validation. Moreover, the proposed method outperforms the validation set approach, yielding a smaller mean percentage error when predicting a target feature in an undisclosed Equinor play using a random forest model.
The resulting outputs are training and test sets ready for model fit and assessment with any machine learning algorithm. Thus, the proposed workflow offers spatially aware sets ready for predictive machine learning problems, with estimation difficulty similar to the planned real-world use of the model, and compatible with any spatial data analysis task.
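The first two steps of the workflow — simple kriging variance as a data-configuration-only proxy for prediction difficulty, then rejection sampling toward a target difficulty — can be sketched as follows. This is a minimal illustration under stated assumptions: the well locations, the exponential covariance model, the Gaussian acceptance kernel, and the use of the median difficulty as the "planned real-world use" target are all placeholders, not the paper's data or exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: 60 well locations in a 1000 x 1000 area with an
# exponential covariance model (covariance = sill - semivariogram).
sill, vrange = 1.0, 300.0
cov = lambda h: sill * np.exp(-3.0 * h / vrange)

wells = rng.uniform(0, 1000, size=(60, 2))

def sk_variance(x0, data_xy):
    """Simple kriging variance at x0 given conditioning locations: a proxy
    for spatial prediction difficulty that uses only the data configuration."""
    d = np.linalg.norm(data_xy - x0, axis=1)
    D = np.linalg.norm(data_xy[:, None] - data_xy[None, :], axis=-1)
    w = np.linalg.solve(cov(D) + 1e-9 * np.eye(len(data_xy)), cov(d))
    return sill - w @ cov(d)

# Leave-one-out difficulty of each candidate test well.
difficulty = np.array([sk_variance(wells[i], np.delete(wells, i, axis=0))
                       for i in range(len(wells))])

# Rejection sampling toward a target difficulty (here the median, standing in
# for the difficulty of the planned real-world use of the model).
target = np.median(difficulty)
bw = difficulty.std() + 1e-12
accept_p = np.exp(-((difficulty - target) ** 2) / (2 * bw ** 2))
test_idx = np.where(rng.random(len(wells)) < accept_p)[0]
train_idx = np.setdiff1d(np.arange(len(wells)), test_idx)
```

The resulting split biases the test set toward wells whose kriging variance matches the intended deployment difficulty, which is the fairness property the abstract describes.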
Texas oil and gas well database
Working database of Texas Railroad Commission data, EIA oil and gas prices, and geological data.
A Machine Learning Workflow to Support the Identification of Subsurface Resource Analogs
Identifying subsurface resource analogs from mature subsurface datasets is vital for developing new prospects, because information on new prospects is often initially limited or absent. Traditional methods for selecting these analogs, executed by domain experts, face challenges due to subsurface datasets' high complexity, noise, and dimensionality. This article aims to simplify this process by introducing an objective geostatistics-based machine learning workflow for analog selection. Our innovative workflow offers a systematic and unbiased solution, incorporating a new dissimilarity metric and scoring metrics, group consistency, and pairwise similarity scores. These elements effectively account for spatial and multivariate data relationships, measuring similarities within and between groups in reduced dimensional spaces. Our workflow begins with multidimensional scaling from inferential machine learning, utilizing our dissimilarity metric to obtain data representations in a reduced dimensional space. Following this, density-based spatial clustering of applications with noise (DBSCAN) identifies analog clusters and spatial analogs in the reduced space. Then, our scoring metrics assist in quantifying and identifying analogous data samples, while providing useful diagnostics for resource exploration. We demonstrate the efficacy of this workflow with wells from the Duvernay Formation and a test scenario incorporating various well types common in unconventional reservoirs, including infill, outlier, sparse, and centered wells. Through this application, we successfully identified and grouped analog clusters of test well samples based on geological properties and cumulative gas production, showcasing the potential of our proposed workflow for practical use in the field.
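The two core steps of the workflow — an MDS embedding of a dissimilarity matrix followed by DBSCAN clustering in the reduced space — can be sketched with standard library calls. This is an illustrative sketch only: the toy well properties and the plain Euclidean dissimilarity on standardized features are assumptions standing in for the article's proposed spatial/multivariate dissimilarity metric and scoring metrics.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import DBSCAN
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)

# Assumed toy "wells": two property groups standing in for analog populations
# (e.g., porosity, thickness, cumulative gas production) — illustrative only.
group_a = rng.normal([0.10, 20.0, 1.0], [0.01, 2.0, 0.1], size=(15, 3))
group_b = rng.normal([0.25, 60.0, 5.0], [0.01, 2.0, 0.1], size=(15, 3))
wells = np.vstack([group_a, group_b])

# Standardize so no single feature dominates the dissimilarity, then build the
# pairwise dissimilarity matrix (Euclidean here; the article uses its own metric).
z = (wells - wells.mean(0)) / wells.std(0)
D = squareform(pdist(z))

# Step 1: metric MDS embeds the wells in a reduced dimensional space.
lds = MDS(n_components=2, dissimilarity="precomputed",
          random_state=0).fit_transform(D)

# Step 2: DBSCAN finds analog clusters in the reduced space;
# the label -1 marks outlier wells that belong to no analog group.
labels = DBSCAN(eps=1.0, min_samples=3).fit_predict(lds)
```

A practical version would replace the Euclidean dissimilarity with the article's metric and follow the clustering with the group-consistency and pairwise-similarity scoring the abstract describes.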